Data Weeding Techniques Applied to Roget's Thesaurus

Authors

  • Uta Priss
  • L. John Old
Abstract

It can be difficult to automatically generate “nice” graphical representations for concept lattices from lexical databases, such as Roget’s Thesaurus, because the data sources tend to be large and complex. This paper discusses a variety of “data weeding” techniques that can be applied to reduce the size of a concept lattice, first in general and then with respect to Roget’s Thesaurus. The aim is for the resulting lattices to display neither too much nor too little information, regardless of which search terms a user has entered.

Related resources

A Comparison of WordNet and Roget's Taxonomy for Measuring Semantic Similarity

This paper presents the results of using Roget's International Thesaurus as the taxonomy in a semantic similarity measurement task. Four similarity metrics were taken from the literature and applied to Roget's. The experimental evaluation suggests that the traditional edge counting approach does surprisingly well (a correlation of r=0.88 with a benchmark set of human similarity judgements, with...

Evaluation of Automatic Updates of Roget's Thesaurus

Keywords: lexical resources, Roget's Thesaurus, WordNet, semantic relatedness, synonym selection, pseudo-word-sense disambiguation, analogy. Thesauri and similarly organised resources attract increasing interest of Natural Language Processing researchers. Thesauri age fast, so there is a constant need to update their vocabulary. Since a manual update cycle takes considerable time, autom...

Disambiguating Hypernym Relations for Roget's Thesaurus

Roget’s Thesaurus is a lexical resource which groups terms by semantic relatedness. It is Roget’s shortcoming that the relations are ambiguous, in that it does not name them; it only shows that there is a relation between terms. Our work focuses on disambiguating hypernym relations within Roget’s Thesaurus. Several techniques of identifying hypernym relations are compared and contrasted in this...

Roget2000: a 2D hyperbolic tree visualization of Roget's Thesaurus

Thesauri, such as Roget’s Thesaurus, show the semantic relationships among terms and concepts. Understanding these relationships can lead to a greater understanding of linguistic structure and could be applied to creating more efficient natural-language recognition and processing programs. A general assumption is that focus and context displays of hyperbolic trees accelerate browsing ability ov...

Publication date: 2007